Metagenomic ventures into outer sequence space

Author

  • Bas E Dutilh
Abstract

Sequencing DNA or RNA directly from the environment often yields many sequencing reads that have no homologs in the database. These reads are referred to as "unknowns" and reflect the vast unexplored microbial sequence space of our biosphere, also known as "biological dark matter." However, unknowns also exist because metagenomic datasets are not optimally mined. Researchers are under pressure to publish and move on, so the unknown sequences are often left as they are and conclusions are drawn only from reads with annotated homologs. This can cause abundant and widespread genomes to be overlooked, such as the recently discovered human gut bacteriophage crAssphage. The unknowns may be enriched for bacteriophage sequences, the most abundant and genetically diverse component of the biosphere and of sequence space. Still, the actual size of biological sequence space remains an open question. De novo assembly of shotgun metagenomes is the most powerful tool to address it.

Similar resources

Functional assignment of metagenomic data: challenges and applications

Metagenomic sequencing provides a unique opportunity to explore earth's limitless environments harboring scores of yet unknown and mostly unculturable microbes and other organisms. Functional analysis of the metagenomic data plays a central role in projects aiming to explore the most essential questions in microbiology, namely 'In a given environment, among the microbes present, what are they d...

A new sequence space and norm of certain matrix operators on this space

In the present paper, we introduce the sequence space \[ l_p(E,\Delta) = \left\{ x = (x_n)_{n=1}^{\infty} : \sum_{n=1}^{\infty} \left| \sum_{j \in E_n} x_j - \sum_{j \in E_{n+1}} x_j \right|^p < \infty \right\}, \] where $E=(E_n)$ is a partition of finite subsets of the positive integers and $p \ge 1$. We investigate its topological properties and inclusion relations. Moreover, we consider the problem of fin...

Analysis of the sorting signals directing NADH-cytochrome b5 reductase to two locations within yeast mitochondria.

Mitochondrial NADH-cytochrome b5 reductase (Mcr1p) is encoded by a single nuclear gene and imported into two different submitochondrial compartments: the outer membrane and the intermembrane space. We now show that the amino-terminal 47 amino acids suffice to target the Mcr1 protein to both destinations. The first 12 residues of this sequence function as a weak matrix-targeting signal; the rema...

An internal targeting signal directing proteins into the mitochondrial intermembrane space.

Import of most nucleus-encoded preproteins into mitochondria is mediated by N-terminal presequences and requires a membrane potential and ATP hydrolysis. Little is known about the chemical nature and localization of other mitochondrial targeting signals or of the mechanisms by which they facilitate membrane passage. Mitochondrial heme lyases lack N-terminal targeting information. These proteins...

CompostBin: A DNA Composition-Based Algorithm for Binning Environmental Shotgun Reads

A major hindrance to studies of microbial diversity has been that the vast majority of microbes cannot be cultured in the laboratory and thus are not amenable to traditional methods of characterization. Environmental shotgun sequencing (ESS) overcomes this hurdle by sequencing the DNA from the organisms present in a microbial community. The interpretation of this metagenomic data can be greatly...

Journal title:

Volume 4, Issue

Pages -

Publication date 2014